Cache Transaction Ids #454

jameinel · 2017-06-15T14:18:46Z

This could be landed in v2-unstable, but it was built off of v2, which apparently has many commits that aren't in v2-unstable.

When dealing with records that have many transactions in their txn-queue, one of the top consumers of CPU is actually the conversion from Token to ObjectId. The primary culprit of this was hasPreReq, which walks all of the tokens in a document and compares the object ID to some other object ID, and essentially converts every Token to an ID multiple times.

This is part of a series of small fixes I've put together around making handling large TXN queues a little bit nicer. It includes some test cases that expose where the problems are, and the patches to make those test cases a bit nicer.

The first commit is just running 'go fmt' which apparently reorders a few imports.

The second introduces the tests, so that you can get an impression of baseline performance. You can see the performance by running:
cd mgo.v2/txn
go test -check.vv -check.f TxnQueue -qlength 100 # 200, 400, 800, 1600, etc

You can see the O(N^2) performance. I've put together a spreadsheet showing the changes with various patches. https://docs.google.com/spreadsheets/d/1OXcIMooO8fG3xzLmn_4WSECLDdXBjQ-cKzkk26hovNU/edit?usp=sharing

The key indicators on the graphs for this patch are the Blue Circles and the Red Triangles.
This patch has the biggest impact on Setup Prepared, ResumeAll Prepared, Setup Preparing and ResumeAll preparing (less of an impact on AssertionGrowth).

The algorithm is still fundamentally O(N^2) because we are re-reading the document from the database N times and it has N transaction tokens on the document. And the time to convert from BSON into Go objects is a significant portion of the time. Converting from Token to ObjectId was the other significant portion of the time, which is what this patch addresses.

I chose to break up the work into smaller performance tweaks so we can discuss them reasonably independently.

When walking graphs (hasPreReq), we can actually spend a lot of time doing the conversion from a 'hex+nonce' token string back to a binary ObjectId. Cache them in the flusher.

Add 2 more patches to update mgo. ## Description of change The patch is a simple caching of conversion from txn token (which is id + nonce) into id. We always need the ID at least once, and in some of the algorithms we do a graph search and end up hitting the token many times. This actually started to dominate CPU time when the graphs got large. The cache is pretty trivial and expires along with the tokens, so it shouldn't be much of a burden. ## QA steps No real visible changes on functioning systems. ## Documentation changes None. ## Bug reference go-mgo/mgo#454

jameinel added 3 commits June 6, 2017 13:55

run 'go fmt' using go 1.8

20e84f3

Add the test cases that show O(N^2) performance

2eb5d1c

Cache conversion from token to TXN ObjectId.

9428095

When walking graphs (hasPreReq), we can actually spend a lot of time doing the conversion from a 'hex+nonce' token string back to a binary ObjectId. Cache them in the flusher.

domodwyer mentioned this pull request Jun 15, 2017

Cache TXN ID globalsign/mgo#9

Closed

jameinel closed this Nov 7, 2017

jameinel reopened this Nov 7, 2017

jameinel mentioned this pull request Nov 7, 2017

Add 2 more patches to update mgo. juju/juju#8032

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Cache Transaction Ids #454

Cache Transaction Ids #454

jameinel commented Jun 15, 2017 •

edited

Loading

Cache Transaction Ids #454

Are you sure you want to change the base?

Cache Transaction Ids #454

Conversation

jameinel commented Jun 15, 2017 • edited Loading

jameinel commented Jun 15, 2017 •

edited

Loading